Overview

Dataset statistics

Number of variables: 12
Number of observations: 1599
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 220
Duplicate rows (%): 13.8%
Total size in memory: 150.0 KiB
Average record size in memory: 96.1 B

Variable types

Numeric: 12

Alerts

Dataset has 220 (13.8%) duplicate rows (Duplicates)
fixed acidity is highly correlated with citric acid and 3 other fields (High correlation)
volatile acidity is highly correlated with citric acid (High correlation)
citric acid is highly correlated with fixed acidity and 3 other fields (High correlation)
free sulfur dioxide is highly correlated with residual sugar and 1 other field (High correlation)
total sulfur dioxide is highly correlated with free sulfur dioxide (High correlation)
density is highly correlated with fixed acidity and 3 other fields (High correlation)
pH is highly correlated with fixed acidity and 2 other fields (High correlation)
residual sugar is highly correlated with free sulfur dioxide and 1 other field (High correlation)
chlorides is highly correlated with citric acid and 1 other field (High correlation)
sulphates is highly correlated with citric acid and 1 other field (High correlation)
alcohol is highly correlated with fixed acidity and 1 other field (High correlation)
citric acid has 132 (8.3%) zeros (Zeros)

Reproduction

Analysis started: 2022-11-24 06:57:49.692000
Analysis finished: 2022-11-24 06:58:18.382271
Duration: 28.69 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

fixed acidity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 96
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.319637273
Minimum: 4.6
Maximum: 15.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 4.6
5th percentile: 6.1
Q1: 7.1
Median: 7.9
Q3: 9.2
95th percentile: 11.8
Maximum: 15.9
Range: 11.3
Interquartile range (IQR): 2.1

Descriptive statistics

Standard deviation: 1.741096318
Coefficient of variation (CV): 0.2092755082
Kurtosis: 1.132143398
Mean: 8.319637273
Median Absolute Deviation (MAD): 1
Skewness: 0.9827514413
Sum: 13303.1
Variance: 3.031416389
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
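
The quantile and descriptive statistics listed for each variable can be recomputed from raw values. The following is a minimal sketch using only the Python standard library. It assumes the sample standard deviation (n - 1 denominator) and linearly interpolated quantiles; the report does not state which conventions it uses, and `describe` is a hypothetical helper name, not part of pandas-profiling.

```python
import statistics


def describe(values):
    """Recompute a few of the per-variable statistics shown in the report.

    Assumptions (not confirmed by the report): sample standard deviation
    with an n - 1 denominator, and linearly interpolated quartiles.
    """
    mean = statistics.fmean(values)
    std = statistics.stdev(values)          # sample standard deviation
    variance = statistics.variance(values)  # sample variance
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Median absolute deviation: median of |x - median(x)|
    mad = statistics.median(abs(v - median) for v in values)
    return {
        "mean": mean,
        "std": std,
        "variance": variance,
        "cv": std / mean,                   # coefficient of variation
        "range": max(values) - min(values),
        "iqr": q3 - q1,                     # interquartile range
        "mad": mad,
    }


if __name__ == "__main__":
    print(describe([1, 2, 3, 4, 5]))
```

As a consistency check on the figures above, the reported CV for fixed acidity is the reported standard deviation divided by the reported mean (1.741096318 / 8.319637273), and the reported variance is the square of the reported standard deviation.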
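Most frequent values
Value | Count | Frequency (%)
7.2 | 67 | 4.2%
7.1 | 57 | 3.6%
7.8 | 53 | 3.3%
7.5 | 52 | 3.3%
7 | 50 | 3.1%
7.7 | 49 | 3.1%
6.8 | 46 | 2.9%
7.6 | 46 | 2.9%
8.2 | 45 | 2.8%
7.4 | 44 | 2.8%
Other values (86) | 1090 | 68.2%

Smallest values
Value | Count | Frequency (%)
4.6 | 1 | 0.1%
4.7 | 1 | 0.1%
4.9 | 1 | 0.1%
5 | 6 | 0.4%
5.1 | 4 | 0.3%
5.2 | 6 | 0.4%
5.3 | 4 | 0.3%
5.4 | 5 | 0.3%
5.5 | 1 | 0.1%
5.6 | 14 | 0.9%

Largest values
Value | Count | Frequency (%)
15.9 | 1 | 0.1%
15.6 | 2 | 0.1%
15.5 | 2 | 0.1%
15 | 2 | 0.1%
14.3 | 1 | 0.1%
14 | 1 | 0.1%
13.8 | 1 | 0.1%
13.7 | 2 | 0.1%
13.5 | 1 | 0.1%
13.4 | 1 | 0.1%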

volatile acidity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 143
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5278205128
Minimum: 0.12
Maximum: 1.58
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 0.12
5th percentile: 0.27
Q1: 0.39
Median: 0.52
Q3: 0.64
95th percentile: 0.84
Maximum: 1.58
Range: 1.46
Interquartile range (IQR): 0.25

Descriptive statistics

Standard deviation: 0.1790597042
Coefficient of variation (CV): 0.3392435493
Kurtosis: 1.22554225
Mean: 0.5278205128
Median Absolute Deviation (MAD): 0.12
Skewness: 0.6715925724
Sum: 843.985
Variance: 0.03206237765
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
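Most frequent values
Value | Count | Frequency (%)
0.6 | 47 | 2.9%
0.5 | 46 | 2.9%
0.43 | 43 | 2.7%
0.59 | 39 | 2.4%
0.36 | 38 | 2.4%
0.58 | 38 | 2.4%
0.4 | 37 | 2.3%
0.38 | 35 | 2.2%
0.49 | 35 | 2.2%
0.39 | 35 | 2.2%
Other values (133) | 1206 | 75.4%

Smallest values
Value | Count | Frequency (%)
0.12 | 3 | 0.2%
0.16 | 2 | 0.1%
0.18 | 10 | 0.6%
0.19 | 2 | 0.1%
0.2 | 3 | 0.2%
0.21 | 6 | 0.4%
0.22 | 6 | 0.4%
0.23 | 5 | 0.3%
0.24 | 13 | 0.8%
0.25 | 7 | 0.4%

Largest values
Value | Count | Frequency (%)
1.58 | 1 | 0.1%
1.33 | 2 | 0.1%
1.24 | 1 | 0.1%
1.185 | 1 | 0.1%
1.18 | 1 | 0.1%
1.13 | 1 | 0.1%
1.115 | 1 | 0.1%
1.09 | 1 | 0.1%
1.07 | 1 | 0.1%
1.04 | 3 | 0.2%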

citric acid
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 80
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2709756098
Minimum: 0
Maximum: 1
Zeros: 132
Zeros (%): 8.3%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.09
Median: 0.26
Q3: 0.42
95th percentile: 0.6
Maximum: 1
Range: 1
Interquartile range (IQR): 0.33

Descriptive statistics

Standard deviation: 0.1948011374
Coefficient of variation (CV): 0.7188880858
Kurtosis: -0.7889975154
Mean: 0.2709756098
Median Absolute Deviation (MAD): 0.17
Skewness: 0.3183372953
Sum: 433.29
Variance: 0.03794748313
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
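Most frequent values
Value | Count | Frequency (%)
0 | 132 | 8.3%
0.49 | 68 | 4.3%
0.24 | 51 | 3.2%
0.02 | 50 | 3.1%
0.26 | 38 | 2.4%
0.1 | 35 | 2.2%
0.08 | 33 | 2.1%
0.21 | 33 | 2.1%
0.01 | 33 | 2.1%
0.32 | 32 | 2.0%
Other values (70) | 1094 | 68.4%

Smallest values
Value | Count | Frequency (%)
0 | 132 | 8.3%
0.01 | 33 | 2.1%
0.02 | 50 | 3.1%
0.03 | 30 | 1.9%
0.04 | 29 | 1.8%
0.05 | 20 | 1.3%
0.06 | 24 | 1.5%
0.07 | 22 | 1.4%
0.08 | 33 | 2.1%
0.09 | 30 | 1.9%

Largest values
Value | Count | Frequency (%)
1 | 1 | 0.1%
0.79 | 1 | 0.1%
0.78 | 1 | 0.1%
0.76 | 3 | 0.2%
0.75 | 1 | 0.1%
0.74 | 4 | 0.3%
0.73 | 3 | 0.2%
0.72 | 1 | 0.1%
0.71 | 1 | 0.1%
0.7 | 2 | 0.1%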

residual sugar
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 91
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.538805503
Minimum: 0.9
Maximum: 15.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 0.9
5th percentile: 1.59
Q1: 1.9
Median: 2.2
Q3: 2.6
95th percentile: 5.1
Maximum: 15.5
Range: 14.6
Interquartile range (IQR): 0.7

Descriptive statistics

Standard deviation: 1.40992806
Coefficient of variation (CV): 0.5553509545
Kurtosis: 28.61759542
Mean: 2.538805503
Median Absolute Deviation (MAD): 0.3
Skewness: 4.540655426
Sum: 4059.55
Variance: 1.987897133
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
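Most frequent values
Value | Count | Frequency (%)
2 | 156 | 9.8%
2.2 | 131 | 8.2%
1.8 | 129 | 8.1%
2.1 | 128 | 8.0%
1.9 | 117 | 7.3%
2.3 | 109 | 6.8%
2.4 | 86 | 5.4%
2.5 | 84 | 5.3%
2.6 | 79 | 4.9%
1.7 | 76 | 4.8%
Other values (81) | 504 | 31.5%

Smallest values
Value | Count | Frequency (%)
0.9 | 2 | 0.1%
1.2 | 8 | 0.5%
1.3 | 5 | 0.3%
1.4 | 35 | 2.2%
1.5 | 30 | 1.9%
1.6 | 58 | 3.6%
1.65 | 2 | 0.1%
1.7 | 76 | 4.8%
1.75 | 2 | 0.1%
1.8 | 129 | 8.1%

Largest values
Value | Count | Frequency (%)
15.5 | 1 | 0.1%
15.4 | 2 | 0.1%
13.9 | 1 | 0.1%
13.8 | 2 | 0.1%
13.4 | 1 | 0.1%
12.9 | 1 | 0.1%
11 | 2 | 0.1%
10.7 | 1 | 0.1%
9 | 1 | 0.1%
8.9 | 1 | 0.1%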

chlorides
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 153
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08746654159
Minimum: 0.012
Maximum: 0.611
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 0.012
5th percentile: 0.054
Q1: 0.07
Median: 0.079
Q3: 0.09
95th percentile: 0.1261
Maximum: 0.611
Range: 0.599
Interquartile range (IQR): 0.02

Descriptive statistics

Standard deviation: 0.04706530201
Coefficient of variation (CV): 0.5380949236
Kurtosis: 41.71578725
Mean: 0.08746654159
Median Absolute Deviation (MAD): 0.01
Skewness: 5.680346572
Sum: 139.859
Variance: 0.002215142653
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
0.08 | 66 | 4.1%
0.074 | 55 | 3.4%
0.076 | 51 | 3.2%
0.078 | 51 | 3.2%
0.084 | 49 | 3.1%
0.071 | 47 | 2.9%
0.077 | 47 | 2.9%
0.082 | 46 | 2.9%
0.075 | 45 | 2.8%
0.079 | 43 | 2.7%
Other values (143) | 1099 | 68.7%

Smallest values
Value | Count | Frequency (%)
0.012 | 2 | 0.1%
0.034 | 1 | 0.1%
0.038 | 2 | 0.1%
0.039 | 4 | 0.3%
0.041 | 4 | 0.3%
0.042 | 3 | 0.2%
0.043 | 1 | 0.1%
0.044 | 5 | 0.3%
0.045 | 4 | 0.3%
0.046 | 4 | 0.3%

Largest values
Value | Count | Frequency (%)
0.611 | 1 | 0.1%
0.61 | 1 | 0.1%
0.467 | 1 | 0.1%
0.464 | 1 | 0.1%
0.422 | 1 | 0.1%
0.415 | 3 | 0.2%
0.414 | 2 | 0.1%
0.413 | 1 | 0.1%
0.403 | 1 | 0.1%
0.401 | 1 | 0.1%

free sulfur dioxide
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 60
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.87492183
Minimum: 1
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 7
Median: 14
Q3: 21
95th percentile: 35
Maximum: 72
Range: 71
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 10.46015697
Coefficient of variation (CV): 0.6589107704
Kurtosis: 2.023562046
Mean: 15.87492183
Median Absolute Deviation (MAD): 7
Skewness: 1.250567293
Sum: 25384
Variance: 109.4148838
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
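Most frequent values
Value | Count | Frequency (%)
6 | 138 | 8.6%
5 | 104 | 6.5%
10 | 79 | 4.9%
15 | 78 | 4.9%
12 | 75 | 4.7%
7 | 71 | 4.4%
9 | 62 | 3.9%
16 | 61 | 3.8%
17 | 60 | 3.8%
11 | 59 | 3.7%
Other values (50) | 812 | 50.8%

Smallest values
Value | Count | Frequency (%)
1 | 3 | 0.2%
2 | 1 | 0.1%
3 | 49 | 3.1%
4 | 41 | 2.6%
5 | 104 | 6.5%
5.5 | 1 | 0.1%
6 | 138 | 8.6%
7 | 71 | 4.4%
8 | 56 | 3.5%
9 | 62 | 3.9%

Largest values
Value | Count | Frequency (%)
72 | 1 | 0.1%
68 | 2 | 0.1%
66 | 1 | 0.1%
57 | 1 | 0.1%
55 | 2 | 0.1%
54 | 1 | 0.1%
53 | 1 | 0.1%
52 | 3 | 0.2%
51 | 4 | 0.3%
50 | 2 | 0.1%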

total sulfur dioxide
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 144
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.46779237
Minimum: 6
Maximum: 289
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 6
5th percentile: 11
Q1: 22
Median: 38
Q3: 62
95th percentile: 112.1
Maximum: 289
Range: 283
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 32.89532448
Coefficient of variation (CV): 0.7079166623
Kurtosis: 3.809824488
Mean: 46.46779237
Median Absolute Deviation (MAD): 18
Skewness: 1.515531258
Sum: 74302
Variance: 1082.102373
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
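Most frequent values
Value | Count | Frequency (%)
28 | 43 | 2.7%
24 | 36 | 2.3%
15 | 35 | 2.2%
18 | 35 | 2.2%
23 | 34 | 2.1%
14 | 33 | 2.1%
20 | 33 | 2.1%
31 | 32 | 2.0%
38 | 31 | 1.9%
27 | 30 | 1.9%
Other values (134) | 1257 | 78.6%

Smallest values
Value | Count | Frequency (%)
6 | 3 | 0.2%
7 | 4 | 0.3%
8 | 14 | 0.9%
9 | 14 | 0.9%
10 | 27 | 1.7%
11 | 26 | 1.6%
12 | 29 | 1.8%
13 | 28 | 1.8%
14 | 33 | 2.1%
15 | 35 | 2.2%

Largest values
Value | Count | Frequency (%)
289 | 1 | 0.1%
278 | 1 | 0.1%
165 | 1 | 0.1%
160 | 1 | 0.1%
155 | 1 | 0.1%
153 | 1 | 0.1%
152 | 1 | 0.1%
151 | 2 | 0.1%
149 | 1 | 0.1%
148 | 2 | 0.1%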

density
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 436
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9967466792
Minimum: 0.99007
Maximum: 1.00369
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 0.99007
5th percentile: 0.993598
Q1: 0.9956
Median: 0.99675
Q3: 0.997835
95th percentile: 1
Maximum: 1.00369
Range: 0.01362
Interquartile range (IQR): 0.002235

Descriptive statistics

Standard deviation: 0.001887333954
Coefficient of variation (CV): 0.001893494098
Kurtosis: 0.9340790655
Mean: 0.9967466792
Median Absolute Deviation (MAD): 0.00113
Skewness: 0.07128766295
Sum: 1593.79794
Variance: 3.562029453 × 10^-6
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
0.9972 | 36 | 2.3%
0.9968 | 35 | 2.2%
0.9976 | 35 | 2.2%
0.998 | 29 | 1.8%
0.9962 | 28 | 1.8%
0.9978 | 26 | 1.6%
0.9964 | 25 | 1.6%
0.9994 | 24 | 1.5%
0.997 | 24 | 1.5%
0.9982 | 23 | 1.4%
Other values (426) | 1314 | 82.2%

Smallest values
Value | Count | Frequency (%)
0.99007 | 2 | 0.1%
0.9902 | 1 | 0.1%
0.99064 | 2 | 0.1%
0.9908 | 1 | 0.1%
0.99084 | 1 | 0.1%
0.9912 | 1 | 0.1%
0.9915 | 1 | 0.1%
0.99154 | 1 | 0.1%
0.99157 | 1 | 0.1%
0.9916 | 2 | 0.1%

Largest values
Value | Count | Frequency (%)
1.00369 | 2 | 0.1%
1.0032 | 1 | 0.1%
1.00315 | 3 | 0.2%
1.00289 | 1 | 0.1%
1.0026 | 2 | 0.1%
1.00242 | 2 | 0.1%
1.0022 | 2 | 0.1%
1.0021 | 2 | 0.1%
1.0018 | 1 | 0.1%
1.0015 | 2 | 0.1%

pH
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 89
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.311113196
Minimum: 2.74
Maximum: 4.01
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 2.74
5th percentile: 3.06
Q1: 3.21
Median: 3.31
Q3: 3.4
95th percentile: 3.57
Maximum: 4.01
Range: 1.27
Interquartile range (IQR): 0.19

Descriptive statistics

Standard deviation: 0.1543864649
Coefficient of variation (CV): 0.04662675535
Kurtosis: 0.8069425082
Mean: 3.311113196
Median Absolute Deviation (MAD): 0.1
Skewness: 0.1936834981
Sum: 5294.47
Variance: 0.02383518055
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
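Most frequent values
Value | Count | Frequency (%)
3.3 | 57 | 3.6%
3.36 | 56 | 3.5%
3.26 | 53 | 3.3%
3.38 | 48 | 3.0%
3.39 | 48 | 3.0%
3.29 | 46 | 2.9%
3.32 | 45 | 2.8%
3.34 | 43 | 2.7%
3.28 | 42 | 2.6%
3.2 | 39 | 2.4%
Other values (79) | 1122 | 70.2%

Smallest values
Value | Count | Frequency (%)
2.74 | 1 | 0.1%
2.86 | 1 | 0.1%
2.87 | 1 | 0.1%
2.88 | 2 | 0.1%
2.89 | 4 | 0.3%
2.9 | 1 | 0.1%
2.92 | 4 | 0.3%
2.93 | 3 | 0.2%
2.94 | 4 | 0.3%
2.95 | 1 | 0.1%

Largest values
Value | Count | Frequency (%)
4.01 | 2 | 0.1%
3.9 | 2 | 0.1%
3.85 | 1 | 0.1%
3.78 | 2 | 0.1%
3.75 | 1 | 0.1%
3.74 | 1 | 0.1%
3.72 | 3 | 0.2%
3.71 | 4 | 0.3%
3.7 | 1 | 0.1%
3.69 | 4 | 0.3%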

sulphates
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 96
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.658148843
Minimum: 0.33
Maximum: 2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 0.33
5th percentile: 0.47
Q1: 0.55
Median: 0.62
Q3: 0.73
95th percentile: 0.93
Maximum: 2
Range: 1.67
Interquartile range (IQR): 0.18

Descriptive statistics

Standard deviation: 0.1695069796
Coefficient of variation (CV): 0.2575511321
Kurtosis: 11.72025073
Mean: 0.658148843
Median Absolute Deviation (MAD): 0.08
Skewness: 2.428672354
Sum: 1052.38
Variance: 0.02873261613
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
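Most frequent values
Value | Count | Frequency (%)
0.6 | 69 | 4.3%
0.58 | 68 | 4.3%
0.54 | 68 | 4.3%
0.62 | 61 | 3.8%
0.56 | 60 | 3.8%
0.57 | 55 | 3.4%
0.53 | 51 | 3.2%
0.59 | 51 | 3.2%
0.55 | 50 | 3.1%
0.63 | 48 | 3.0%
Other values (86) | 1018 | 63.7%

Smallest values
Value | Count | Frequency (%)
0.33 | 1 | 0.1%
0.37 | 2 | 0.1%
0.39 | 6 | 0.4%
0.4 | 4 | 0.3%
0.42 | 5 | 0.3%
0.43 | 8 | 0.5%
0.44 | 16 | 1.0%
0.45 | 12 | 0.8%
0.46 | 18 | 1.1%
0.47 | 19 | 1.2%

Largest values
Value | Count | Frequency (%)
2 | 1 | 0.1%
1.98 | 1 | 0.1%
1.95 | 2 | 0.1%
1.62 | 1 | 0.1%
1.61 | 1 | 0.1%
1.59 | 1 | 0.1%
1.56 | 1 | 0.1%
1.36 | 3 | 0.2%
1.34 | 1 | 0.1%
1.33 | 1 | 0.1%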

alcohol
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 65
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.42298311
Minimum: 8.4
Maximum: 14.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 8.4
5th percentile: 9.2
Q1: 9.5
Median: 10.2
Q3: 11.1
95th percentile: 12.5
Maximum: 14.9
Range: 6.5
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.065667582
Coefficient of variation (CV): 0.1022420904
Kurtosis: 0.2000293113
Mean: 10.42298311
Median Absolute Deviation (MAD): 0.7
Skewness: 0.8608288069
Sum: 16666.35
Variance: 1.135647395
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
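Most frequent values
Value | Count | Frequency (%)
9.5 | 139 | 8.7%
9.4 | 103 | 6.4%
9.8 | 78 | 4.9%
9.2 | 72 | 4.5%
10 | 67 | 4.2%
10.5 | 67 | 4.2%
9.3 | 59 | 3.7%
11 | 59 | 3.7%
9.6 | 59 | 3.7%
9.7 | 54 | 3.4%
Other values (55) | 842 | 52.7%

Smallest values
Value | Count | Frequency (%)
8.4 | 2 | 0.1%
8.5 | 1 | 0.1%
8.7 | 2 | 0.1%
8.8 | 2 | 0.1%
9 | 30 | 1.9%
9.05 | 1 | 0.1%
9.1 | 23 | 1.4%
9.2 | 72 | 4.5%
9.233333333 | 1 | 0.1%
9.25 | 1 | 0.1%

Largest values
Value | Count | Frequency (%)
14.9 | 1 | 0.1%
14 | 7 | 0.4%
13.6 | 4 | 0.3%
13.56666667 | 1 | 0.1%
13.5 | 1 | 0.1%
13.4 | 3 | 0.2%
13.3 | 3 | 0.2%
13.2 | 1 | 0.1%
13.1 | 2 | 0.1%
13 | 6 | 0.4%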

quality
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.636022514
Minimum: 3
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.6 KiB

Quantile statistics

Minimum: 3
5th percentile: 5
Q1: 5
Median: 6
Q3: 6
95th percentile: 7
Maximum: 8
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8075694397
Coefficient of variation (CV): 0.143287121
Kurtosis: 0.2967081198
Mean: 5.636022514
Median Absolute Deviation (MAD): 1
Skewness: 0.2178015755
Sum: 9012
Variance: 0.6521684
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
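Most frequent values
Value | Count | Frequency (%)
5 | 681 | 42.6%
6 | 638 | 39.9%
7 | 199 | 12.4%
4 | 53 | 3.3%
8 | 18 | 1.1%
3 | 10 | 0.6%

Smallest values
Value | Count | Frequency (%)
3 | 10 | 0.6%
4 | 53 | 3.3%
5 | 681 | 42.6%
6 | 638 | 39.9%
7 | 199 | 12.4%
8 | 18 | 1.1%

Largest values
Value | Count | Frequency (%)
8 | 18 | 1.1%
7 | 199 | 12.4%
6 | 638 | 39.9%
5 | 681 | 42.6%
4 | 53 | 3.3%
3 | 10 | 0.6%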

Interactions

[Figures: pairwise interaction plots for the 12 variables (144 panels)]

Correlations


Auto

The auto setting selects an easily interpretable pairwise correlation metric according to the types of the two columns (vartype-vartype : method): categorical-categorical : Cramér's V; numerical-categorical : Cramér's V (using a discretized numerical column); numerical-numerical : Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
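
The mapping described above can be written as a small dispatch function. This is only an illustrative sketch: `auto_method` and the type labels `"numerical"` and `"categorical"` are hypothetical names chosen here, not the pandas-profiling API.

```python
def auto_method(type_a: str, type_b: str) -> str:
    """Pick a correlation method name for a pair of column types.

    Mirrors the mapping stated in the text:
      numerical-numerical     -> Spearman's rho
      categorical-categorical -> Cramer's V
      numerical-categorical   -> Cramer's V (numerical column discretized)
    """
    allowed = {"numerical", "categorical"}
    if type_a not in allowed or type_b not in allowed:
        raise ValueError(f"unsupported column types: {type_a!r}, {type_b!r}")
    if type_a == "numerical" and type_b == "numerical":
        return "spearman"
    # Any pair that involves a categorical column falls back to Cramer's V.
    return "cramers_v"
```

All 12 variables in this report are numeric, so under this mapping every pair would use Spearman's ρ.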

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
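
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. Giving tied values the average of their positions is an assumption made here; the report does not say how ties are ranked. The function names are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # the 1/n factors cancel


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])` is 1 up to floating-point rounding, because the relationship is monotonic even though it is not linear.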

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
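
A minimal sketch of this calculation in plain Python follows, together with a check of the location and scale invariance mentioned above. The function name and the sample data are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson's r: covariance of x and y over the product of their std devs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # The common 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / (sx * sy)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 5, 4, 5]
    r = pearson(x, y)
    # Shifting x and rescaling it by a positive factor leaves r unchanged.
    r_shifted = pearson([3 * a + 7 for a in x], y)
    print(r, r_shifted)
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative number flips the sign of r.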

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
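
The pair-counting definition above can be sketched directly in Python. This form (often called tau-a) does not correct for ties; statistical libraries frequently implement a tie-corrected variant, and the report does not state which one it uses, so treat this as an illustration of the definition rather than a reproduction of the report's numbers.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs
```

With `x = [1, 2, 3]` and `y = [1, 3, 2]` there are two concordant pairs and one discordant pair out of three, so the result is 1/3.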

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

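index | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality
0 | 7.4 | 0.70 | 0.00 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5
1 | 7.8 | 0.88 | 0.00 | 2.6 | 0.098 | 25.0 | 67.0 | 0.9968 | 3.20 | 0.68 | 9.8 | 5
2 | 7.8 | 0.76 | 0.04 | 2.3 | 0.092 | 15.0 | 54.0 | 0.9970 | 3.26 | 0.65 | 9.8 | 5
3 | 11.2 | 0.28 | 0.56 | 1.9 | 0.075 | 17.0 | 60.0 | 0.9980 | 3.16 | 0.58 | 9.8 | 6
4 | 7.4 | 0.70 | 0.00 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5
5 | 7.4 | 0.66 | 0.00 | 1.8 | 0.075 | 13.0 | 40.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5
6 | 7.9 | 0.60 | 0.06 | 1.6 | 0.069 | 15.0 | 59.0 | 0.9964 | 3.30 | 0.46 | 9.4 | 5
7 | 7.3 | 0.65 | 0.00 | 1.2 | 0.065 | 15.0 | 21.0 | 0.9946 | 3.39 | 0.47 | 10.0 | 7
8 | 7.8 | 0.58 | 0.02 | 2.0 | 0.073 | 9.0 | 18.0 | 0.9968 | 3.36 | 0.57 | 9.5 | 7
9 | 7.5 | 0.50 | 0.36 | 6.1 | 0.071 | 17.0 | 102.0 | 0.9978 | 3.35 | 0.80 | 10.5 | 5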

Last rows

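index | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality
1589 | 6.6 | 0.725 | 0.20 | 7.8 | 0.073 | 29.0 | 79.0 | 0.99770 | 3.29 | 0.54 | 9.2 | 5
1590 | 6.3 | 0.550 | 0.15 | 1.8 | 0.077 | 26.0 | 35.0 | 0.99314 | 3.32 | 0.82 | 11.6 | 6
1591 | 5.4 | 0.740 | 0.09 | 1.7 | 0.089 | 16.0 | 26.0 | 0.99402 | 3.67 | 0.56 | 11.6 | 6
1592 | 6.3 | 0.510 | 0.13 | 2.3 | 0.076 | 29.0 | 40.0 | 0.99574 | 3.42 | 0.75 | 11.0 | 6
1593 | 6.8 | 0.620 | 0.08 | 1.9 | 0.068 | 28.0 | 38.0 | 0.99651 | 3.42 | 0.82 | 9.5 | 6
1594 | 6.2 | 0.600 | 0.08 | 2.0 | 0.090 | 32.0 | 44.0 | 0.99490 | 3.45 | 0.58 | 10.5 | 5
1595 | 5.9 | 0.550 | 0.10 | 2.2 | 0.062 | 39.0 | 51.0 | 0.99512 | 3.52 | 0.76 | 11.2 | 6
1596 | 6.3 | 0.510 | 0.13 | 2.3 | 0.076 | 29.0 | 40.0 | 0.99574 | 3.42 | 0.75 | 11.0 | 6
1597 | 5.9 | 0.645 | 0.12 | 2.0 | 0.075 | 32.0 | 44.0 | 0.99547 | 3.57 | 0.71 | 10.2 | 5
1598 | 6.0 | 0.310 | 0.47 | 3.6 | 0.067 | 18.0 | 42.0 | 0.99549 | 3.39 | 0.66 | 11.0 | 6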

Duplicate rows

Most frequently occurring

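index | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | # duplicates
22 | 6.7 | 0.460 | 0.24 | 1.7 | 0.077 | 18.0 | 34.0 | 0.99480 | 3.39 | 0.60 | 10.6 | 6 | 4
52 | 7.2 | 0.360 | 0.46 | 2.1 | 0.074 | 24.0 | 44.0 | 0.99534 | 3.40 | 0.85 | 11.0 | 7 | 4
63 | 7.2 | 0.695 | 0.13 | 2.0 | 0.076 | 12.0 | 20.0 | 0.99546 | 3.29 | 0.54 | 10.1 | 5 | 4
81 | 7.5 | 0.510 | 0.02 | 1.7 | 0.084 | 13.0 | 31.0 | 0.99538 | 3.36 | 0.54 | 10.5 | 6 | 4
5 | 6.0 | 0.500 | 0.00 | 1.4 | 0.057 | 15.0 | 26.0 | 0.99448 | 3.36 | 0.45 | 9.5 | 5 | 3
12 | 6.4 | 0.640 | 0.21 | 1.8 | 0.081 | 14.0 | 31.0 | 0.99689 | 3.59 | 0.66 | 9.8 | 5 | 3
39 | 7.0 | 0.650 | 0.02 | 2.1 | 0.066 | 8.0 | 25.0 | 0.99720 | 3.47 | 0.67 | 9.5 | 6 | 3
40 | 7.0 | 0.690 | 0.07 | 2.5 | 0.091 | 15.0 | 21.0 | 0.99572 | 3.38 | 0.60 | 11.3 | 6 | 3
60 | 7.2 | 0.630 | 0.00 | 1.9 | 0.097 | 14.0 | 38.0 | 0.99675 | 3.37 | 0.58 | 9.0 | 6 | 3
104 | 7.8 | 0.600 | 0.26 | 2.0 | 0.080 | 31.0 | 131.0 | 0.99622 | 3.21 | 0.52 | 9.9 | 5 | 3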
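
A table like the one above can be produced by counting how often each complete row occurs. The sketch below uses only the standard library and reports two common ways of counting duplicates, because the report does not state which convention lies behind its "Duplicate rows" figure. `duplicate_summary` is a hypothetical helper name.

```python
from collections import Counter


def duplicate_summary(rows):
    """Summarize exact duplicate rows.

    Returns a tuple (groups, extra_rows, table):
      groups     - number of distinct rows that occur more than once
      extra_rows - number of rows beyond the first occurrence of each row
      table      - (row, occurrences) pairs for repeated rows, most frequent first
    """
    counts = Counter(tuple(row) for row in rows)
    repeated = [(row, c) for row, c in counts.most_common() if c > 1]
    groups = len(repeated)
    extra_rows = sum(c - 1 for _, c in repeated)
    return groups, extra_rows, repeated


if __name__ == "__main__":
    sample = [(1, 2), (1, 2), (3, 4), (1, 2), (5, 6), (5, 6)]
    print(duplicate_summary(sample))
```

In the sample data, the row `(1, 2)` occurs three times and `(5, 6)` twice, giving two duplicate groups and three extra rows; the two conventions can therefore yield different totals for the same dataset.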